DETAILED ACTION



Claim Rejections - 35 USC § 102 



The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 
A person shall be entitled to a patent unless - 

(e) the invention was described in a patent granted on an application for patent by another filed in the 
United States before the invention thereof by the applicant for patent, or on an international application 
by another who has fulfilled the requirements of paragraphs (1), (2), and (4) of section 371(c) of this 
title before the invention thereof by the applicant for patent. 

The changes made to 35 U.S.C. 102(e) by the American Inventors Protection Act
of 1999 (AIPA) do not apply to the examination of this application as the application
being examined was not (1) filed on or after November 29, 2000, or (2) voluntarily
published under 35 U.S.C. 122(b). Therefore, this application is examined under 35
U.S.C. 102(e) prior to the amendment by the AIPA (pre-AIPA 35 U.S.C. 102(e)).

1. Claims 1-36 are rejected under 35 U.S.C. 102(e) as being anticipated by Kuhn et 
al., (U.S. Patent 6,327,565), hereinafter referred to as Kuhn. 

Regarding claims 1, 7 and 35, Kuhn discloses a system for speaker adaptation
that includes the following: a new speaker input (col. 5, lines 26-39, Fig. 3, 40), which 
corresponds to "an input for receiving an input signal derived from a spoken utterance 
that contains at least one speech element that potentially matches the given speech 
element"; a set of HMMs 44 (one for each sound) (col. 5, lines 23-34), which 
corresponds to "a model group associated to the given speech element, said model 
group comprising a plurality of speech models, each speech model of said plurality of
speech models being a different representation of the given speech element"; with an 
inherent processing unit to generate an adapted model based on the input (Fig. 3, 52; col.
5, lines 50-57), which corresponds to "a processing unit coupled to the input for
processing the input signal and the model group to generate a hybrid speech model 
associated to the given speech element, said hybrid speech model being a combination 
of speech models in said plurality of speech models on the basis of the input signal 
derived from the spoken utterance"; and adaptation occurs during recognition with the 
inherent output of the recognition result from the recognizer (col. 2, lines 45-50), which 
corresponds to "an output for releasing a signal indicative of said hybrid speech model 
associated to the given speech element in a format suitable for use by a speech 
recognition device." 

Regarding claim 2, Kuhn teaches everything claimed, as applied above (see 
claim 1); in addition, Kuhn teaches the modeling of speech units (such as a phrase,
word, subword, phoneme or the like) (col. 3, lines 4-7), which corresponds to "the given 
speech element is an element selected from the group consisting of phones, diphones, 
syllables and words." 

Regarding claim 3, Kuhn teaches everything claimed, as applied above (see 
claim 2); in addition, Kuhn teaches the training of a speaker dependent model (one for 
each sound unit) (col. 5, lines 30-33), which corresponds to "input signal derived from a
spoken utterance is indicative of a speaker specific speech model associated to the at 
least one speech element." 

Regarding claim 4, Kuhn teaches everything claimed, as applied above (see
claim 3); in addition, Kuhn teaches the creation of a new speaker dependent model (col. 
5, lines 22-57), which corresponds to "hybrid speech model is weighted toward the 
speaker specific speech model." 

Regarding claim 5, Kuhn teaches everything claimed, as applied above (see 
claim 4); in addition, Kuhn teaches that the speaker dependent model serves to 
estimate the linear combination of coefficients that will comprise the adapted model 44 
(col. 5, lines 50-57), which corresponds to "said hybrid speech model is derived by computing
a linear combination of the speech models in said model group."
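
For illustration only, the claimed "linear combination of the speech models in said model group" can be sketched as a weighted sum. This sketch assumes each speech model is reduced to a vector of numeric parameters (for example, Gaussian means); the function name `combine_models`, the example vectors and the weights are hypothetical and are taken from neither Kuhn nor the application.

```python
def combine_models(models, weights):
    """Return the weighted sum of equal-length model parameter vectors."""
    if len(models) != len(weights):
        raise ValueError("one weight is required per model")
    dim = len(models[0])
    hybrid = [0.0] * dim
    for model, weight in zip(models, weights):
        if len(model) != dim:
            raise ValueError("all models must have the same dimension")
        for i, value in enumerate(model):
            hybrid[i] += weight * value
    return hybrid


# Hypothetical model group: three representations of one speech element.
model_group = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# Hypothetical weights; in practice they would be estimated from the input signal.
weights = [0.5, 0.25, 0.25]

hybrid = combine_models(model_group, weights)
print(hybrid)
```

The weights determine how strongly each model in the group contributes to the hybrid model; how they are estimated is outside the scope of this sketch.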

Regarding claim 6, Kuhn teaches everything claimed, as applied above (see 
claim 5). In addition, Kuhn teaches the following: the use of an iterative process where 
multiple speaker dependent inputs can be used during training (col. 5, lines 22-58, in 
particular lines 54-58), which corresponds to "a first input and wherein said input signal 
is a first input signal, said apparatus further comprising: a) a second input for receiving a 
second input signal conveying a data element identifying the given speech element"; 
speaker dependent and speaker independent HMMs (models) for each sound unit (col. 
4, lines 40-46), which corresponds to "b) a database of model groups comprising a 
plurality of model groups, each model group being associated to a respective speech 
element, each model group comprising a set of speech models"; and the construction of 
a new model for a given sound (in the supervised mode) (col. 5, lines 22-58, in
particular lines 33-36), which corresponds to "said processing unit being further 
operative for extracting from said database of model groups a certain model group 
associated to the data element received at said second input identifying the given
speech element." 

Regarding claim 8, Kuhn discloses an algorithm (Fig. 3) for speaker adaptation 
inherently implemented with a computational unit that includes the following: a new
speaker input (col. 5, lines 26-39, Fig. 3, 40), which corresponds to "an input for 
receiving an input signal derived from a spoken utterance that contains at least one 
speech element that potentially matches the given speech element"; a set of HMMs 44
(one for each sound) inherently stored in a memory (col. 5, lines 23-34), which 
corresponds to "a memory unit for storing a model group associated to the given speech 
element, said model group comprising a plurality of speech models, each speech model 
of said plurality of speech models being a different representation of the given speech 
element"; with an inherent processing unit to generate an adapted model based on the
input (Fig. 3, 52; col. 5, lines 50-57), which corresponds to "a processing unit coupled to
the input for processing the input signal and the model group to generate a hybrid 
speech model associated to the given speech element, said hybrid speech model being 
a combination of speech models in said plurality of speech models on the basis of the 
input signal derived from the spoken utterance"; the adaptation of the models during 
recognition (col. 2, lines 45-50) with the inherent release of the recognition result, which 
corresponds to "an output for releasing a signal indicative of said hybrid speech model 
associated to the given speech element in a format suitable for use by a speech 
recognition device." 

Regarding claim 9, Kuhn discloses a device for speaker adaptation that includes
speaker independent and dependent models (col. 2, lines 34-50) which includes the 
following features: a new speaker input (col. 5, lines 26-39, Fig. 3, 40), which 
corresponds to "an input for receiving an input signal derived from a spoken utterance
that contains at least one speech element that potentially matches the given speech 
element"; two sets of HMMs inherently stored in a memory, one speaker dependent 44 
and the other adapted 52 (col. 5, lines 23-34), which corresponds to "a model group 
associated to the given speech element, said model group comprising a plurality of 
speech models, each speech model of said plurality of speech models being a different 
representation of the given speech element, said model group comprising two sets of 
speech models namely a first set having speech models of a first type and a second set 
having speech models of a second type, each speech model of a first type in said first 
set being associated to a speech model of the second type in the second set"; an 
inherent processing unit for creating a speaker dependent model and a supervector 
(col. 5, lines 40-49; elements 44 and 48), which corresponds to "a) processing the input signal and the
model group to generate a hybrid speech model associated to the given speech 
element, said hybrid speech model being a combination of speech models of the first 
type in said plurality of speech models on the basis of the input signal derived from the 
spoken utterance"; the creation of an adapted model 52 from the supervector (col. 5, 
lines 39-57), which corresponds to "b) processing the hybrid speech model to generate 
a complex speech model associated to the given speech element, said complex speech 
model being a combination of speech models of the second type in said plurality of 
speech models"; and the inherent release of the recognition result while the training is
occurring (col. 2, lines 45-50), which corresponds to "an output for releasing a signal 
indicative of said complex speech model associated to the given speech element in a 
format suitable for use by a speech recognition device." 

Regarding claim 10, Kuhn teaches everything claimed, as applied above (see 
claim 9); in addition, Kuhn teaches the use of an adapted model that is modified using 
the speaker dependent model (Fig. 3, col. 5, lines 26-58), which corresponds to "speech 
model of a second type is indicative of a speech model having a higher complexity than 
a speech model of a first type to which it is associated." 

Regarding claim 11, Kuhn teaches everything claimed, as applied above (see
claim 10); in addition, Kuhn teaches that the Hidden Markov Models (used in Kuhn's 
models) can be used to model speech units such as a phrase, word, subword, phoneme 
or the like (col. 3, lines 4-7), which corresponds to "speech element is indicative of a 
data element selected from the set consisting of phones, diphones, syllables and 
words." 

Regarding claim 12, Kuhn teaches everything claimed, as applied above (see 
claim 11); in addition, Kuhn teaches that the training can be done in a supervised mode
where the training system knows the contents of the training speech in advance (col. 5, 
lines 33-36), which corresponds to "said input signal derived from a spoken utterance is 
indicative of a speaker specific speech model associated to the at least one speech 
element." 

Regarding claim 13, Kuhn teaches everything claimed, as applied above (see
claim 12); in addition, Kuhn teaches that the models are modified with speaker specific 
data (Fig. 3, col. 5, lines 23-57), which corresponds to "said hybrid speech model is 
weighted toward the speaker specific speech model." 

Regarding claim 14, Kuhn teaches everything claimed, as applied above (see 
claim 13); in addition, Kuhn teaches that the speaker dependent model 44 serves to 
estimate the linear combination of coefficients that will comprise the adapted model for 
the new speaker (col. 5, line 44), which corresponds to "said hybrid speech model is
derived by computing a linear combination of the speech models of the first type." 

Regarding claim 15, Kuhn teaches everything claimed, as applied above (see 
claim 14); in addition, Kuhn teaches that the speaker dependent model serves as an 
estimate of the coefficients that will comprise the adapted model (col. 5, lines 50-67, col. 
6, lines 1-12), which corresponds to "first set of parameters indicative of weights 
associated to speech models of the first type, said complex speech model being derived 
by computing a second linear combination of the speech models of the second type, 
said second linear combination being characterized by a second set of parameters 
indicative of weights associated to speech models of the second type." 

Regarding claim 16, Kuhn teaches everything claimed, as applied above (see 
claim 15); in addition, Kuhn teaches that the speaker dependent model serves as an 
estimate for the coefficients that will comprise the adapted model (col. 5, lines 50-67, 
col. 6, lines 1-12), which corresponds to "said first set of parameters and said second 
set of parameters are indicative of substantially same weights." 

Regarding claim 17, Kuhn teaches everything claimed, as applied above (see
claim 10). In addition, Kuhn teaches the following: a supervised mode indicating that the
input signal is known (col. 5, lines 33-35), which corresponds to "a) a second input for 
receiving a second input signal indicative of a data element identifying the given speech 
element"; the use of multiple models with corresponding information (Fig. 3, 44, 48, 42;
col. 5, lines 24-58), which corresponds to "b) a database of model groups comprising a 
plurality of model groups, each model group being associated to a respective speech 
element, each model group comprising two sets of speech models namely a first set 
having speech models of a first type and a second set having speech models of a 
second type, each speech model of a first type in said first set being associated to a 
speech model of the second type in the second set"; the creation of a new supervector 
with the iterative process used to construct another set of HMMs (inherent selectivity 
associated with a specific speech element if in the supervised mode) (col. 5, lines 22- 
58), which corresponds to "unit being further operative for extracting from said database 
of model groups a certain model group associated to the data element received at said 
second input identifying the given speech element." 

Regarding claim 18, Kuhn teaches techniques and algorithms for speaker 
adaptation based on eigenvoices using multiple model groups (Fig. 3, 44, 48, 52) where
the model groups contain HMMs that can be used to represent phrases, words, 
subwords, or phonemes (col. 3, lines 4-8) and inherently implemented on a 
computational device, which corresponds to "a data structure for storing a plurality of 
model groups, each model group being associated to a respective speech element in a 
phonetic alphabet, each model group comprising a plurality of speech models, each
model group being suitable for use by a processing device." 

Regarding claim 19, Kuhn teaches everything claimed, as applied above (see 
claim 18); in addition, Kuhn teaches that the speech units can be phrases, words, 
subwords or phonemes or the like (col. 3, lines 4-7), which corresponds to "the given 
speech element is indicative of a data element selected from the set consisting of 
phones, diphones, syllables and words." 

Regarding claim 20, Kuhn teaches everything claimed, as applied above (see 
claim 19); in addition, Kuhn teaches the use of a data structure with multiple model 
groups with the adapted model being the most highly processed (Fig. 3, 44, 48, 52; col.
5, lines 23-58), which corresponds to "wherein each model group comprises two sets of 
speech models namely a first set having a plurality of speech models of a first type and 
a second set having a plurality of speech models of a second type, each speech model 
of a first type in said first set being associated to a speech model of the second type in 
the second set, each speech model of the second type being indicative of a speech 
model having a higher complexity than a speech model of the first type to which the
speech model of the second type is associated." 

Regarding claims 21 and 36, Kuhn discloses an adaptable speech recognition
system (col. 2, lines 45-50) with the following features: a speaker input (Fig. 3, 40) which
when in supervised mode knows the content of an input in advance (col. 5, lines 33-36), 
which corresponds to "an input for receiving an input signal indicative of a spoken 
utterance that is indicative of at least one speech element"; procedures for training of a 
speaker dependent recognizer (Fig. 3, 42, 44; col. 5, lines 39-49), which corresponds to
"a first processing unit coupled to said input operative for processing the input signal to 
derive from a speech recognition dictionary at least one speech model associated to a 
given speech element that constitutes a potential match to the at least one speech 
element"; procedures to construct a supervector (Fig. 3, 46, 48, 38), which corresponds to
"a second processing unit coupled to said first processing unit for generating a modified
version of the at least one speech model on the basis of the input signal"; and
procedures to construct a new set of HMMs based on the supervector (Fig. 3, 50 52), 
which corresponds to "a third processing unit coupled to said second processing unit for 
processing the input signal on the basis of the modified version of the at least one 
speech model to generate a recognition result indicative of whether the modified version 
of the at least one speech model constitutes a match to the input signal"; and since 
Kuhn's system is a recognition system that automatically adapts during recognition (col. 
2, lines 45-50), Kuhn's system inherently releases recognition results, which 
corresponds to "an output for releasing a signal indicative of the recognition result." 

Regarding claim 22, Kuhn teaches everything claimed, as applied above (see 
claim 21); in addition, Kuhn teaches the processing of the input to produce a speaker 
dependent model of a specific input when in the supervised mode (Fig. 3, 40, 42, 44; col.
5, lines 39-49), which corresponds to "wherein said first processing unit is operative for
generating a speaker specific speech model derived on the basis of the input signal, the 
speaker specific speech model being indicative of the acoustic characteristics of the
at least one speech element."

Regarding claim 23, Kuhn teaches everything claimed, as applied above (see
claim 22); in addition, Kuhn teaches that the speaker dependent model 44 serves to
estimate the linear combination of coefficients that will comprise the adapted model (col. 
5, lines 50-55), which corresponds to "said modified version of the at least one speech 
model is indicative of a hybrid speech model associated to the given speech element." 

Regarding claim 24, Kuhn teaches everything claimed, as applied above (see 
claim 23). In addition, Kuhn teaches the following: the transfer of data between the 
procedures (indicated by the arrows in Fig. 3 between elements 42 44 and 46), which 
corresponds to "coupling member for allowing data exchange between the first 
processing unit and the second processing unit, said coupling member being suitable 
for receiving the speaker specific speech model derived from the input signal"; a model 
group containing HMMs with models of speech sounds (Fig. 3, col. 5, lines 26-39), 
which corresponds to "a model group associated to the given speech element, said 
model group comprising a plurality of speech models, each speech model of said 
plurality of speech models being a different representation of the given speech 
element"; a procedure to construct a new set of HMMs based on the supervector (Fig. 3,
50), which corresponds to "a functional unit coupled to the coupling member for
processing the speaker specific speech model and the model group to generate the 
hybrid speech model associated to the given speech element, said hybrid speech model 
being a combination of speech models in said plurality of speech models on the basis of 
the speaker specific speech model"; and since Kuhn discloses a speech recognition 
system (col. 2, lines 45-50) it has an inherent means for indicating a particular 



Application/Control Number: 09/468,138 Page 13

Art Unit: 2654 

recognition event on output, which corresponds to "an output coupling member for 
allowing data exchange between the second processing unit and the third processing 
unit, said output coupling member being suitable for releasing a signal indicative of the 
hybrid speech model associated to the given speech element." 

Regarding claim 25, Kuhn teaches everything claimed, as applied above (see 
claim 24); in addition, Kuhn teaches that the dependent model 44 serves to estimate the 
linear combination of coefficients that will comprise the adapted model (col. 5, lines 50- 
54), which corresponds to "said hybrid speech model is weighted toward the speaker 
specific speech model." 

Regarding claim 26, Kuhn teaches everything claimed, as applied above (see 
claim 24); in addition, Kuhn teaches that the dependent model 44 serves to estimate the 
linear combination of coefficients that will comprise the adapted model (col. 5, lines 50- 
54), which corresponds to "said hybrid speech model is derived by computing a linear 
combination of the speech models in said group of speech models." 

Regarding claim 27, Kuhn teaches everything claimed, as applied above (see 
claim 24); in addition, Kuhn teaches the following: transfer of data between the speaker 
dependent building portion of the model and the adapted model (indicated by arrows in 
Fig. 3), which corresponds to "a) a second coupling member for allowing data exchange 
between the first processing unit and the second processing unit, said second coupling 
member being suitable for receiving a data element identifying the given speech 
element"; groups of models (Fig. 3, 44, 48, 52) containing HMMs of speech sounds (col.
3, lines 3-7; col. 5, lines 50-57), which corresponds to "b) a database of model groups
comprising a plurality of model groups, each model group being associated to a
respective speech element, each model group comprising a set of speech models"; a 
procedure 50 for constructing a new set of HMMs based on the supervector 48 that can 
access the models with the adapted model (Fig. 3, 50, 52), which corresponds to
"functional unit being further operative for extracting from said database of model 
groups a certain model group associated to the data element received at said second 
coupling member identifying the given speech element." 

Regarding claim 28, Kuhn teaches everything claimed, as applied above (see 
claim 22); in addition, Kuhn teaches that an HMM contained in the adapted model 52 is
adapted to the speech input and can be constructed by an iterative process (col. 5, lines
54-56), which corresponds to "said modified version of the at least one speech model is
indicative of a complex speech model associated to the given speech element."

Regarding claim 29, Kuhn teaches everything claimed, as applied above (see 
claim 28). In addition, Kuhn teaches the following: the speaker specific model 44 
generated from the input is coupled (arrow between 44 and 46 in Fig. 3) to the next 
stage for further processing, which corresponds to "coupling member for allowing data 
exchange between the first processing unit and the second processing unit, said 
coupling member being suitable for receiving the speaker specific speech model 
derived from the input signal"; a speaker specific model group 44 containing multiple 
HMMs for sounds and an adapted model group 52 with corresponding sounds (col. 5, 
lines 22-57), which corresponds to "a model group associated to the given speech 
element, said model group comprising a plurality of speech models, each speech model 
of said plurality of speech models being a different representation of the given speech
element, said model group comprising two sets of speech models namely a first set 
having speech models of a first type and a second set having speech models of a 
second type, each speech model of a first type in said first set being associated to a 
speech model of the second type in the second set"; a procedure for training speaker
dependent models (Fig. 3, 42) and an adapted model 52 that combines speaker specific
and speaker independent data (col. 5, lines 39-58), which corresponds to "a) processing 
the speaker specific speech model and the model group to generate a hybrid speech 
model associated to the given speech element, said hybrid speech model being a 
combination of speech models of the first type in said plurality of speech models on the 
basis of the speaker specific speech model"; a procedure to construct new sets of 
HMMs 52 based on a supervector (Fig. 3, 50) where the resulting models are a
combination of the speaker independent and speaker dependent data (col. 5, lines 22-58),
which corresponds to "b) processing the hybrid speech model to generate a complex 
speech model associated to the given speech element, said complex speech model 
being a combination of speech models of the second type in said plurality of speech 
models"; and since Kuhn discloses a speech recognition system (col. 2, lines 45-50) it 
has an inherent means for indicating a particular recognition event on output, which 
corresponds to "output coupling member for allowing data exchange between the 
second processing unit and the third processing unit, said coupling member being 
suitable for releasing a signal indicative of the complex speech model associated to the 
given speech element." 

Regarding claim 30, Kuhn teaches everything claimed, as applied above (see
claim 29); in addition, Kuhn teaches that the adapted model is based on a linear 
combination of the components from the speaker dependent model which might entail 
multiple iterations (col. 5, lines 39-65), which corresponds to "any speech model of the 
second type is indicative of a speech model having a higher complexity than a speech 
model of the first type to which it is associated." 

Regarding claim 31, Kuhn teaches everything claimed, as applied above (see 
claim 30); in addition, Kuhn teaches that the adapted model is a linear combination of 
the coefficients from the speaker dependent model (col. 5, lines 50-55), which 
corresponds to "wherein said hybrid speech model is weighted toward the speaker 
specific speech model." 

Regarding claim 32, Kuhn teaches everything claimed, as applied above (see 
claim 31); in addition, Kuhn teaches that the adapted model is a linear combination of 
the coefficients from the speaker dependent model (col. 5, lines 50-55), which 
corresponds to "said hybrid speech model is derived by computing a linear combination 
of the speech models of the first type."

Regarding claim 33, Kuhn teaches everything claimed, as applied above (see 
claim 32). In addition, Kuhn teaches the use of an iterative procedure to
construct a new supervector from the adapted model and thereafter to construct
another set of HMMs from which a further adapted model may be constructed (col. 5,
lines 50-57), which corresponds to "wherein said linear combination is a first linear 
combination and is characterized by a first set of parameters indicative of weights 
associated to speech models of the first type, said complex speech model being derived
by computing a second linear combination of the speech models of the second type, 
said second linear combination being characterized by a second set of parameters 
indicative of weights associated to speech models of the second type." 

Regarding claim 34, Kuhn teaches everything claimed, as applied above (see 
claim 33); in addition, Kuhn teaches that the constructing of supervector 48 may be 
accomplished through a computationally simple projection operation or the like (col. 5, 
lines 59-67), which corresponds to "said first set of parameters and said second set of 
parameters is indicative of substantially the same weights." 
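
As a simplified illustration of what a "projection operation" can look like, the sketch below computes the weights of a supervector along a set of basis vectors and rebuilds a constrained supervector from those weights. It assumes an orthonormal basis, so that each weight is a plain dot product; the names `project` and `reconstruct` and all numeric values are hypothetical, and the sketch is not offered as Kuhn's actual implementation.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project(supervector, basis):
    """Return one weight per basis vector (basis assumed orthonormal)."""
    return [dot(supervector, e) for e in basis]


def reconstruct(weights, basis):
    """Rebuild a supervector as the weighted sum of the basis vectors."""
    dim = len(basis[0])
    out = [0.0] * dim
    for w, e in zip(weights, basis):
        for i, v in enumerate(e):
            out[i] += w * v
    return out


# Hypothetical two-vector orthonormal basis in a three-dimensional space.
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
# Hypothetical supervector obtained from a new speaker's data.
supervector = [2.0, 3.0, 5.0]

weights = project(supervector, basis)
constrained = reconstruct(weights, basis)
print(weights, constrained)
```

In this sketch the same weights that result from the projection are reused to form the reconstructed supervector; the component outside the span of the basis is discarded.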



Response to Arguments 

2. Applicant's arguments filed on February 3, 2003 have been fully
considered but they are not persuasive. 

3. Applicant asserts on page 19: 

With respect to independent Claims 1, 7, 8 and 35, Kuhn fails to disclose 
using a plurality of speech models to generate a hybrid speech model, 
where each of the plurality of speech models is a "different representation 
of [a] given speech element." Instead, Kuhn uses a set of "uncorrelated" 
vectors to generate a model for a new user. Kuhn's uncorrelated 
eigenvectors are not different representations of the same speech 
element. As a result, Kuhn fails to show each and every element of 
Applicants' invention as claimed in independent Claims 1, 7, 8 and 35 (and
dependent Claims 2-6). (italics added) 

Kuhn teaches that T speakers train speaker dependent models, one model per
speaker, where each model will contain one HMM per sound unit (col. 4, lines 20-46).
This process will result in T representations for each sound unit (i.e., T
different representations for the same speech element).
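
The counting argument above can be sketched as a regrouping of per-speaker models by sound unit. The sketch assumes each trained model is a mapping from sound unit to a parameter vector; the speaker labels, sound units and values are hypothetical placeholders, not data from Kuhn.

```python
def build_model_groups(speaker_models):
    """Regroup {speaker: {unit: model}} into {unit: [model, ...]}."""
    groups = {}
    for speaker, models in speaker_models.items():
        for unit, model in models.items():
            groups.setdefault(unit, []).append(model)
    return groups


# Hypothetical training set with T = 3 speakers and two sound units.
speaker_models = {
    "speaker_1": {"ah": [1.0, 0.2], "s": [0.3, 0.9]},
    "speaker_2": {"ah": [1.1, 0.1], "s": [0.4, 0.8]},
    "speaker_3": {"ah": [0.9, 0.3], "s": [0.2, 1.0]},
}

model_groups = build_model_groups(speaker_models)
for unit, models in model_groups.items():
    print(unit, len(models))
```

Each sound unit ends up with one entry per training speaker, that is, T entries when every speaker has a model for that unit.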



4. Applicant further asserts on page 19:

Regarding Claim 9, Kuhn fails to disclose using a "first set" of speech 
models to generate a "hybrid speech model" and using a "second set" of 
speech models to generate a "complex speech model." The Office Action 
asserts that a "speaker dependent [model] 44" in Kuhn corresponds to the 
first set of speech models recited in Claim 9 and that an "adapted [model] 
52" in Kuhn corresponds to the second set of speech models recited in 
Claim 9. The Office Action also asserts that the "adapted model 52" also 
corresponds to the complex speech model recited in Claim 9. (Office 
Action, Page 6, Second paragraph). The Applicants respectfully submit 
that the Office Action appears to be using the same component of Kuhn 
(the "adapted model 52") to allege anticipation of both the "second set of 
speech models" and the "complex speech model" recited in Claim 9. 
Therefore, the Office Action has failed to establish a prima facie case of 
anticipation because Kuhn fails to show both "a second set of speech 
models" and "a complex speech model." As a result, Kuhn fails to show 
each and every element of Applicants' invention as recited in independent 
Claim 9 (and dependent Claims 10-17). 

Kuhn teaches the generation of a speaker dependent model 44 ("first set"), an
adapted model 52 ("second set"), and a "further" adapted model using an
iterative process 54, where the further adapted model corresponds to the
"complex speech model" recited in claim 9 and is created by constructing a new
supervector from the adapted model and thereafter constructing another set of
HMMs from which a further adapted model may be constructed (col. 5, lines 23-57).

5. Applicant asserts on page 20:

With respect to independent Claim 18, Kuhn fails to disclose a single "data 
structure for storing a plurality of model groups," where each model group 
comprises "a plurality of speech models." As a result, Kuhn fails to teach 
each and every element of Applicants' invention as claimed in Claim 18 
(and dependent Claims 19 and 20). 

The Examiner maintains that the adapted model 52 shown in Figure 3 is
inherently a data structure and, as described by Kuhn, represents a plurality of
speech models (see discussion under "Constructing the Eigenvoice Space" (col.
3) and "Performing the Adaptation" (col. 5)).



6. Applicant further asserts on page 21:

Regarding independent Claims 21 and 36, Kuhn uses a set of 
eigenvectors to generate a model for a new user. Kuhn does not describe 
identifying a speech model that represents a "potential match" to at least 
one speech element and then further processing that speech model. 
Therefore, Kuhn fails to teach each and every element of Applicants' 
invention as recited in independent Claims 21 and 36 (and dependent 
Claims 22-34). 

Kuhn teaches that the training can occur in a supervised mode where the training
system knows the content of the training speech in advance (col. 5, lines 32-36),
which implies identifying speech models as potential matches. Kuhn also
describes a situation where adaptation data may have missing sound units
(certain sound units were not spoken by the new speaker) (col. 6, lines 13-16), which
could only occur if there was an attempt to match the adaptation data with a
speech model.

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the mailing date of this final action. 



Any response to this office action should be mailed to: 

Commissioner of Patents and Trademarks 
Washington, D.C. 20231 

or faxed to: 

(703)872-9314 

Hand-delivered responses should be brought to: 

Crystal Park II 

2121 Crystal Drive 

Arlington, VA. 

Sixth Floor (Receptionist) 

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Dr. V. Paul Harper whose telephone number is (703) 
305-4197. The examiner can normally be reached on Monday through Friday from 8:00 
a.m. to 4:30 p.m. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Marsha D. Banks-Harold, can be reached on (703) 305-4379. The fax 
phone number for the Technology Center 2600 is (703) 872-9314. 

Any inquiry of a general nature or relating to the status of this application or 
proceeding should be directed to the Technology Center 2600 Customer Service office 
whose telephone number is (703) 306-0377. 



VPH/vph 
March 17, 2003



MARSHA D. BANKS-HAROLD 
SUPERVISORY PATENT EXAMINER 
TECHNOLOGY CENTER 2600 



